Oracle Inequalities and Optimal Inference under Group Sparsity
Authors
Abstract
We consider the problem of estimating a sparse linear regression vector β∗ under a Gaussian noise model, for the purposes of both prediction and model selection. We assume that prior knowledge of the sparsity pattern is available: the set of variables is partitioned into prescribed groups, only a few of which are relevant in the estimation process. This group sparsity assumption suggests the Group Lasso method as a means of estimating β∗. We establish oracle inequalities for the prediction and l2 estimation errors of this estimator. These bounds hold under a restricted eigenvalue condition on the design matrix. Under a stronger coherence condition, we derive bounds on the estimation error in mixed (2, p)-norms for 1 ≤ p ≤ ∞. For p = ∞, this result implies that a thresholded version of the Group Lasso estimator selects the sparsity pattern of β∗ with high probability. Next, we prove that the rate of convergence of our upper bounds is optimal in a minimax sense, up to a logarithmic factor, over all estimators on a class of group sparse vectors. Furthermore, we establish lower bounds for the prediction and l2 estimation errors of the usual Lasso estimator. Using this result, we demonstrate that the Group Lasso can achieve an improvement in prediction and estimation performance as compared to the Lasso. An important application of our results is the problem of estimating multiple regression equations simultaneously, or multi-task learning. In this case, our results lead to refinements of the results in [22] and allow one to establish the quantitative advantage of the Group Lasso over the usual Lasso in the multi-task setting. Finally, within the same setting, we show how our results can be extended to more general noise distributions, of which we require only that the fourth moment be finite. To obtain this extension, we establish a new maximal moment inequality, which may be of independent interest.
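The Group Lasso estimator discussed above minimizes a least-squares criterion penalized by the sum of the Euclidean norms of the coefficient groups. As an illustration only (not code from the paper), here is a minimal proximal-gradient sketch; the solver choice (ISTA), step-size rule, iteration count, and uniform group weights are all assumptions made for the example:

```python
import numpy as np

def group_soft_threshold(v, t):
    # Block soft-thresholding: shrink the whole group toward zero,
    # setting it exactly to zero when its norm falls below t.
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal-gradient (ISTA) sketch for the Group Lasso:
        min_b (1/(2n)) ||y - X b||_2^2 + lam * sum_g ||b_g||_2
    `groups` is a list of index arrays partitioning the coordinates."""
    n, d = X.shape
    beta = np.zeros(d)
    # Step size from the Lipschitz constant of the smooth part, ||X||_2^2 / n.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n      # gradient of the smooth term
        z = beta - step * grad               # gradient step
        for g in groups:                     # blockwise proximal step
            beta[g] = group_soft_threshold(z[g], step * lam)
    return beta
```

The blockwise shrinkage is what produces group sparsity: an entire group of coefficients is either kept (and shrunk) or zeroed out together, matching the assumption that only a few prescribed groups are relevant.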
Similar Papers
arXiv:0803.2839v1 [math.ST] 19 Mar 2008 — Aggregation by Exponential Weighting, Sharp Oracle Inequalities and Sparsity
We study the problem of aggregation under the squared loss in the model of regression with deterministic design. We obtain sharp PAC-Bayesian risk bounds for aggregates defined via exponential weights, under general assumptions on the distribution of errors and on the functions to aggregate. We then apply these results to derive sparsity oracle inequalities.
Sparse Estimation by Exponential Weighting
Consider a regression model with fixed design and Gaussian noise where the regression function can potentially be well approximated by a function that admits a sparse representation in a given dictionary. This paper resorts to exponential weights to exploit this underlying sparsity by implementing the principle of sparsity pattern aggregation. This model selection take on sparse estimation allo...
Sparse oracle inequalities for variable selection via regularized quantization
We give oracle inequalities for procedures which combine quantization and variable selection via a weighted Lasso k-means type algorithm. The results are derived for a general family of weights, which can be tuned to size the influence of the variables in different ways. Moreover, these theoretical guarantees are proved to adapt to the corresponding sparsity of the optimal codebooks, suggesting th...
Modern statistical estimation via oracle inequalities
A number of fundamental results in modern statistical theory involve thresholding estimators. This survey paper aims at reconstructing the history of how thresholding rules came to be popular in statistics and describing, in a not overly technical way, the domain of their application. Two notions play a fundamental role in our narrative: sparsity and oracle inequalities. Sparsity is a property ...
Sparsity oracle inequalities for the Lasso
This paper studies oracle properties of ℓ1-penalized least squares in a nonparametric regression setting with random design. We show that the penalized least squares estimator satisfies sparsity oracle inequalities, i.e., bounds in terms of the number of non-zero components of the oracle vector. The results are valid even when the dimension of the model is (much) larger than the sample size and t...